Support vector machine classification for large datasets using decision tree and Fisher linear discriminant

Authors

  • Asdrúbal López Chau
  • Xiaoou Li
  • Wen Yu
Abstract

Training a support vector machine (SVM) has a time complexity between O(n²) and O(n³). Most training algorithms for SVM are therefore not suitable for large data sets. Decision trees can simplify SVM training; however, classification accuracy drops when there are inseparable points. This paper introduces a novel method for SVM classification. A decision tree is used to detect low-entropy regions in the input space, and Fisher's linear discriminant is used to detect the data near the support vectors. Experimental results demonstrate that the approach achieves good classification accuracy with low standard deviation, and that training is significantly faster than with other training methods.
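The abstract names two building blocks without giving their details: an entropy measure for regions of the input space, and Fisher's linear discriminant for locating points near the class boundary. The following is a minimal sketch of both under stated assumptions, not the authors' procedure. It assumes 2-D points, two classes, and an invertible within-class scatter matrix. The helper names (`entropy`, `fisher_direction`, `near_boundary`) and the rule "keep the k points whose projection is closest to the midpoint of the projected class means" are hypothetical choices made for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of the class labels found in one region."""
    n = len(labels)
    return sum(-(c / n) * math.log2(c / n) for c in Counter(labels).values())


def mean(points):
    """Component-wise mean of a list of 2-D points."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(2)]


def scatter(points, m):
    """2x2 scatter matrix of 2-D points around the mean m."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for p in points:
        d = [p[0] - m[0], p[1] - m[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s


def fisher_direction(class0, class1):
    """Fisher direction w = Sw^{-1} (m1 - m0), plus both class means."""
    m0, m1 = mean(class0), mean(class1)
    s0, s1 = scatter(class0, m0), scatter(class1, m1)
    a, b = s0[0][0] + s1[0][0], s0[0][1] + s1[0][1]
    c, d = s0[1][0] + s1[1][0], s0[1][1] + s1[1][1]
    det = a * d - b * c  # assumption: Sw is invertible (det != 0)
    diff = [m1[0] - m0[0], m1[1] - m0[1]]
    w = [(d * diff[0] - b * diff[1]) / det,
         (-c * diff[0] + a * diff[1]) / det]
    return w, m0, m1


def near_boundary(class0, class1, k):
    """Hypothetical selection rule: k points per class nearest the midpoint
    of the projected class means along the Fisher direction."""
    w, m0, m1 = fisher_direction(class0, class1)

    def proj(p):
        return w[0] * p[0] + w[1] * p[1]

    mid = (proj(m0) + proj(m1)) / 2

    def dist(p):
        return abs(proj(p) - mid)

    return sorted(class0, key=dist)[:k], sorted(class1, key=dist)[:k]


# Toy data, invented for this example only.
class0 = [(0, 0), (1, 0), (0, 1), (1, 1)]
class1 = [(3, 3), (4, 3), (3, 4), (4, 4)]
candidates0, candidates1 = near_boundary(class0, class1, k=1)
```

A region whose labels give an entropy near zero contains almost only one class, so it offers few candidates for support vectors. A mixed region has higher entropy, and there the projection onto the Fisher direction gives a cheap way to shortlist points before running a full SVM solver on the reduced set.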


Similar resources

Multi-Group Classification Using Interval Linear Programming

  Among the various statistical and data mining discriminant analysis methods proposed so far for group classification, linear programming discriminant analysis has recently attracted researchers' interest. This study evaluates multi-group discriminant linear programming (MDLP) for classification problems against well-known methods such as neural networks and support vector machines. MDLP is less compli...


Tumor Classification Using Support Vector Machines

In this study, a method for classification of tumor sample microarray data based on support vector machines is presented. Different possibilities for data processing, gene selection and support vector machine classification are recited. The performance of support vector machine classification is compared to that of linear discriminant analysis and decision tree-based classifiers.


Comparison of Parametric and Non-parametric EEG Feature Extraction Methods in Detection of Pediatric Migraine without Aura

Background: Migraine headache without aura is the most common type of migraine, especially among pediatric patients. Diagnosing migraine from quantitative electroencephalography measurements through feature classification has always been a great challenge. It has been proven that different feature extraction and classification methods vary in terms of performance regarding detection and di...


Application of Data Mining Algorithms in Discriminating Sediment Sources in the Nodeh Watershed, Gonabad

Introduction: Reduction of sediment supply requires the implementation of soil conservation and sediment control programs in the form of watershed management plans. Sediment control programs require identifying the relative importance of sediment sources, their quantitative ascription and identification of critical areas within the watersheds. The sediment source ascription involves two...


A prediction distribution of atmospheric pollutants using support vector machines, discriminant analysis and mapping tools (Case study: Tunisia)

Monitoring and controlling air quality parameters form an important subject of atmospheric and environmental research today due to the health impacts caused by the different pollutants present in the urban areas. The support vector machine (SVM), as a supervised learning analysis method, is considered an effective statistical tool for the prediction and analysis of air quality. The work present...



Journal:
  • Future Generation Comp. Syst.

Volume 36

Publication date: 2014